[SPARK-9834][MLLIB] implement weighted least squares via normal equation #8588

Closed
mengxr wants to merge 8 commits into apache:master from mengxr:SPARK-9834

mengxr commented Sep 3, 2015

Contributor

The goal of this PR is to have a weighted least squares implementation that takes the normal equation approach, and hence to be able to provide R-like summary statistics and support IRLS (used by GLMs). The tests match R's lm and glmnet.
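As a rough illustration of the normal-equation approach mentioned above, the following Python sketch minimizes sum_i w_i (a_i · x - b_i)^2 by forming AᵀWA and AᵀWb and solving the resulting linear system. The `Instance(w, a, b)` fields mirror the case class reported by the test builds. The function names, the omission of intercept and regularization handling, and the use of a Cholesky solve are assumptions of this sketch, not a description of the code in this PR.

```python
import math
from typing import List, NamedTuple, Sequence


class Instance(NamedTuple):
    """One weighted observation, mirroring Instance(w, a, b)."""
    w: float             # instance weight
    a: Sequence[float]   # feature vector
    b: float             # label


def solve_spd(m: List[List[float]], rhs: List[float]) -> List[float]:
    """Solve m x = rhs for a symmetric positive definite m via Cholesky."""
    n = len(rhs)
    # Factor m = L L^T, filling the lower triangle row by row.
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = m[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    # Forward substitution: L y = rhs.
    y = [0.0] * n
    for i in range(n):
        y[i] = (rhs[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    # Back substitution: L^T x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def weighted_least_squares(instances: Sequence[Instance]) -> List[float]:
    """Accumulate A^T W A and A^T W b in one pass, then solve for x."""
    n = len(instances[0].a)
    ata = [[0.0] * n for _ in range(n)]
    atb = [0.0] * n
    for w, a, b in instances:
        for i in range(n):
            atb[i] += w * a[i] * b
            for j in range(n):
                ata[i][j] += w * a[i] * a[j]
    return solve_spd(ata, atb)


# Usage: labels follow b = 1 + 2*t exactly; the first feature is a constant 1
# acting as an intercept column, so the result is approximately [1.0, 2.0].
data = [
    Instance(1.0, [1.0, 0.0], 1.0),
    Instance(2.0, [1.0, 1.0], 3.0),
    Instance(1.0, [1.0, 2.0], 5.0),
    Instance(0.5, [1.0, 3.0], 7.0),
]
coefficients = weighted_least_squares(data)
```

Because only the small n-by-n matrix AᵀWA and the vector AᵀWb are needed, the data can be summarized in a single pass, which is what makes this approach attractive when the number of features is modest.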

There are a couple of TODOs that can be addressed in future PRs:

  • consolidate summary statistics aggregators
  • move dspr to BLAS
  • etc
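For readers unfamiliar with the `dspr` item in the list above: in BLAS, `dspr` is the symmetric rank-1 update A := alpha·x·xᵀ + A on a matrix held in packed storage. The Python sketch below assumes upper-triangular, column-major packing; the name `dspr_upper` is hypothetical and this is not the routine from the PR.

```python
from typing import List, Sequence


def dspr_upper(alpha: float, x: Sequence[float], packed: List[float]) -> None:
    """In-place update A := alpha * x * x^T + A.

    A is symmetric n-by-n; only its upper triangle is stored, column by
    column, so entry (i, j) with i <= j lives at index i + j*(j+1)//2.
    """
    n = len(x)
    if len(packed) != n * (n + 1) // 2:
        raise ValueError("packed length must be n*(n+1)/2")
    col_start = 0  # index of entry (0, j) in the packed array
    for j in range(n):
        scaled = alpha * x[j]
        if scaled != 0.0:
            for i in range(j + 1):
                packed[col_start + i] += scaled * x[i]
        col_start += j + 1


# Usage: accumulate 2 * x x^T into an all-zero 3x3 symmetric matrix.
upper = [0.0] * 6
dspr_upper(2.0, [1.0, 2.0, 3.0], upper)
# upper now holds [2.0, 4.0, 8.0, 6.0, 12.0, 18.0]
```

An update of this shape fits the accumulation of a weighted Gram matrix, since each instance contributes w·a·aᵀ.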

It would be nice to have this merged first because it blocks a couple of other features.

@dbtsai

SparkQA commented Sep 3, 2015


Test build #41977 has finished for PR 8588 at commit 34107aa.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class Instance(w: Double, a: Vector, b: Double)

SparkQA commented Sep 3, 2015


Test build #41979 has finished for PR 8588 at commit c75ff92.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class Instance(w: Double, a: Vector, b: Double)

mengxr changed the title from "[WIP][SPARK-9834][MLLIB] implement weighted least squares via normal equation" to "[SPARK-9834][MLLIB] implement weighted least squares via normal equation" on Sep 4, 2015

SparkQA commented Sep 4, 2015


Test build #41994 has finished for PR 8588 at commit 1614f22.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class Instance(w: Double, a: Vector, b: Double)

Member

Do we need `standardizeLabel`? I think that without regularization, fitting with or without standardization will not change the solution.

Contributor Author

We don't need it, but I think it is useful to list the values explicitly here.
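The invariance raised in this exchange can be checked numerically. The Python sketch below uses a one-feature model with no intercept, chosen for brevity. It fits on rescaled data, maps the coefficient back to the original scale, and compares it with a direct fit. The function names, the scale factors `sx` and `sy`, and the L2 penalty form are assumptions of this sketch and are not taken from the PR.

```python
from typing import Sequence


def fit_1d(xs: Sequence[float], ys: Sequence[float], ws: Sequence[float],
           reg: float = 0.0) -> float:
    """Minimize sum_i w_i * (beta*x_i - y_i)^2 + reg * beta^2 over beta."""
    num = sum(w * x * y for x, y, w in zip(xs, ys, ws))
    den = sum(w * x * x for x, w in zip(xs, ws)) + reg
    return num / den


def fit_1d_scaled(xs: Sequence[float], ys: Sequence[float],
                  ws: Sequence[float], sx: float, sy: float,
                  reg: float = 0.0) -> float:
    """Fit on x/sx and y/sy, then map the coefficient to the original scale."""
    beta_scaled = fit_1d([x / sx for x in xs], [y / sy for y in ys], ws, reg)
    return beta_scaled * sy / sx


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]
ws = [1.0, 2.0, 1.0, 0.5]

# Without regularization the two fits agree up to rounding error.
plain = fit_1d(xs, ys, ws)
rescaled = fit_1d_scaled(xs, ys, ws, sx=2.5, sy=4.0)

# With an L2 penalty the feature scale changes the mapped-back coefficient.
plain_reg = fit_1d(xs, ys, ws, reg=1.0)
rescaled_reg = fit_1d_scaled(xs, ys, ws, sx=2.5, sy=4.0, reg=1.0)
```

In this simplified setting the label scale `sy` cancels even when the penalty is present, so the difference in the regularized case comes entirely from rescaling the feature.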

@feynmanliang

Contributor

LGTM. I did not check the low-level implementation.

@feynmanliang

Contributor

jenkins test this please

SparkQA commented Sep 8, 2015


Test build #42145 has finished for PR 8588 at commit c2ec746.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • case class Instance(w: Double, a: Vector, b: Double)

mengxr commented Sep 9, 2015

Contributor Author

Merged into master. I will make follow-up PRs to do the refactoring.

asfgit closed this in 52fe32f on Sep 9, 2015